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We describe some "unrestricted" algorithms which are useful for the computation 
of elementary and special functions when the precision required is not known in 
advance. Several general classes of algorithms are identified and illustrated by ex- 
' amples. Applications of such algorithms are mentioned. 
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1 Introduction 

Floating-point computations are usually performed with fixed precision: the machine used may 
have "single" or "double" precision floating-point hardware, or on small machines fixed-precision 
floating-point operations may be implemented by software or firmware. Most high-level lan- 
£Sj \ guages support only a small number of floating-point precisions, and those which support an 

■ arbitrary number usually demand that the precision be determinable at compile time. 

try 

We say that an algorithm has precision n if its result is computed with error 0(2~ n ). Usually we 
are interested in the relative error, but in some cases (e.g. the computation of sin(x) for x ~ ir) 
it is more appropriate to consider the absolute error. 

In certain applications it is desirable that the precision of floating-point operations should be 
able to be varied at runtime. In this paper we consider algorithms which may be used to eval- 
uate elementary and special functions to precision n, where n may be arbitrarily large. Such 
algorithms have been termed "unrestricted" by Clenshaw and Olver [13]. Note that algorithms 
which are "unrestricted" in our sense may have domain restrictions (e.g. an "unrestricted" al- 
gorithm for exp(x) might be applicable only for x > 0), although such restrictions can often be 
circumvented by the methods of Section 4, or by combining several algorithms with different 
domain restrictions. 

Unrestricted algorithms depend on the availability of variable-precision floating-point arithmetic. 
At present this is usually implemented by software, e.g. in the MP package [7], but it could be 
implemented in firmware or hardware. (Note the historical example of the IBM 1620.) 

In the following sections we ignore the possibility of floating-point underflow or overflow. For 
"ideal" variable-precision arithmetic the exponent range should tend to infinity with the 
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precision n. For most purposes, though, a whole-word exponent, as used in MP [7], is adequate 
to avoid overflow problems. 

Applications of variable-precision floating-point arithmetic include: 

1. generation and testing of accurate tables of constants (e.g. coefficients in minimax poly- 
nomial or rational approximations [9, 25]); 

2. computation using numerically unstable algorithms [2, 9]; 

3. interval arithmetic, where the final intervals may be too large if fixed-precision arithmetic 
is used [20, 28, 37]; 

4. truly machine-independent floating-point computations; 

5. testing of floating-point hardware for correctness and conformity to standards, e.g. those 
proposed in [14, 29]. 

6. number-theoretic computations where very high precision may be essential [24, 30]. 

In this paper we concentrate on unrestricted algorithms rather than their applications. 
Section 2 summarises some preliminary results. In each of Sections 3 to 9 we illustrate, by 
one or two simple examples, a useful general method leading to unrestricted algorithms. Some 
more specialised methods are mentioned in Section 10. The field is vast and we make no at- 
tempt to be comprehensive. For simplicity we usually restrict our attention to real variables and 
omit details of the rounding error analysis. We also omit any discussion of desirable high-level 
language facilities to support variable precision arithmetic, for which see [10, 19, 32]. 

Very few of the algorithms given below are new, in fact most of the identities underlying them 
may be found in [1] or [35]. What may be new is our viewpoint. Often an excellent unrestricted 
algorithm is unsuitable for fixed-precision computation, and vice versa. 

2 Basic arithmetic operations 

We assume that variable-precision floating-point numbers are represented by an integer expo- 
nent and a fraction with t digits to base f3 > 1. We call such numbers "precision n" numbers if 
n ~ (t — 1) log 2 /3. Addition and subtraction of such numbers is straight-forward, and requires 
0(n) operations [22, 23]. We assume at least one guard digit, so the relative error in the com- 
puted result is at most = 0(2~ n ) (see [36]). 

Let M(n) be the number of operations required for multiplication of precision n numbers. By 
the Schonhage-Strassen algorithm [22, 33] 

M(n) = 0(n log n log log n) . (1) 

For the moderate values of n which usually arise in applications, an efficient implementation 
of the classical 0(n 2 ) algorithm may be faster than the Schonhage-Strassen algorithm or other 
asymptotically fast algorithms. 

Let D {n) be the number of operations required for division of precision n numbers. Under 
plausible assumptions it may be shown that D (n) = 0(M(n)) (see, for example, [5]). In practice 
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the "schoolboy" algorithm, which requires 0(n 2 ) operations, may be the fastest unless n is 
rather large. 

It is important to distinguish between multiplication of two precision n numbers and multiplica- 
tion or division of a precision n number by a small (single-precision) integer. The latter require 
only 0(n) operations if implemented in the obvious way. 



3 Power series 

If f(x) is analytic in a neighbourhood of some point c, an obvious method to consider for the 
evaluation of f(x) is summation of the Taylor series 

f{x) = J>- cY f® (c)/j\ + R k (x, c) . (2) 

3=0 

As a simple but instructive example we consider the evaluation of exp(x) for \x\ < 1, using 

fc-i 

exp(x) = ^ aP/j\ + R k {x), (3) 

3=0 

where \R k (x)\ < e/k\ 

Using Stirling's approximation for k\, we see that k > K(n) ~ n/log 2 n is sufficient to ensure 
that \R k (x)\ = 0(2~ n ). Thus the time required is 0(nM(n) / \nn). 

In practice it is convenient to sum the series in the forward direction (J = 0, 1, . . . , k — 1). The 
terms Tj = x/j\ and partial sums 

i=0 

may be generated by the recurrence Tj = x x Tj-i/j, Sj = Sj-i + Tj, and the summation 
terminated when \T k \ < 2~ n . Thus, it is not necessary to estimate k in advance, as it would be 
if the series were summed by Horner's rule in the backward direction (J = k — 1, k — 2, . . . , 0). 

We now consider the effect of rounding errors, under the assumption that floating-point opera- 
tions satisfy 

fl(xopy) = (xopy)(l + S) , (4) 

where \S\ < e and "op" = "+", "x" or "/". Here e < is the "machine-precision" [36]. 
Let Tj be the computed value of Tj, etc. Thus 

\Tj-Tj\/\Tj\ < 2je + 0(e 2 ) (5) 

and 

fc 

\S k -S k \ < kee + Y,^\T 3 \+0(e 2 ) 

(6) 

< (k + 2)ee + 0{e 2 ) = 0{ne) . 

Thus, to get |Sfc — S k \ = 0(2~ n ) it is sufficient that e = 0{2r n jn), i.e. we need to work with 
about log^ n guard digits. This is not a significant overhead if (as we assume) the number of 
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digits may vary dynamically. The slightly better error bound obtainable for backward summa- 
tion is thus of no importance. 



In practice it is inefficient to keep e fixed. We can profitably reduce the working precision when 
computing I\ from T^-i if <C 1, without significantly increasing the error bound. 

It is instructive to consider the effect of relaxing our restriction that \x\ < 1. First suppose 
that x is large and positive. Since \Tj\ > \Tj_i\ when j < \x\, it is clear that the number of 
terms required in the sum (3) is at least of order |x|. Thus, the method is slow for large \x\ (see 
Section 4 for faster methods in this case). 

If \x\ is large and x is negative, the situation is even worse. From Stirling's approximation we 
have 

lm . exp \x\ 
max \Tj\ ~ / 1 1 , (7 
i>o Jl y^^lx] 

but the result is exp(— |x|), so about 2|x|/ln/3 guard digits are required to compensate for 
Lehmer's "catastrophic cancellation" [15]. Since exp(x) = 1/ exp(— x), this problem may easily 
be avoided, but the corresponding problem is not always so easily avoided for other analytic 
functions. 

In the following sections we generally ignore the effect of rounding errors, but the results obtained 
above are typical. For an example of an extremely detailed error analysis of an unrestricted al- 
gorithm, see [13]. 

To conclude this section we give a less trivial example where power series expansions are useful. 
To compute the error function 

erf (x) = 2TT' 1 / 2 ( X e"" 2 du , 
Jo 

we may use the series 
or 

oo _j 2j+l 

erf (x) = exp(-x 2 ) V . (9) 

1-3 - 5-- -(2.7 + 1) V ' 

The series (9) is preferable to (8) for moderate |x| because it involves no cancellation. For large 
|x| neither series is satisfactory, because Q(x 2 ) terms are required, and it is preferable to use the 
asymptotic expansion or continued fraction for erfc(x) = 1 — erf(x): see Sections 5 and 6. 

4 Halving identities 

In Section 3 we saw that the power series is not suitable for evaluation of exp(x) if \x\ is large. 
To reduce the size of the argument we may use the identity 

exp(x) = [exp(x/2)] 2 (10) 

as often as necessary. When applied k times, (10) gives 

exp(x) = [exp(2- fc x)] 2fc . (11) 
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If k = [en 1 / 2 ] + log 2 \x\ for some positive constant c, and (11) is used in conjunction with the 
power series algorithm of Section 3, the time required to evaluate exp(x) to precision n for large 
\x\ is 

0[(n 1/2 + In \x\)M(n)\ , 
better than the 0[(n/\nn+ \x\)M(n)] result of Section 3 (the case k = 0). 

Similar "halving" (or "doubling") identities, derived by replacing x by ix in (10), may be used 
to evaluate trigonometric and inverse trigonometric functions [5, 7, 13, 32]. 

Other identities are useful in special applications: see Section 10 for some examples. 



5 Asymptotic expansions 

Rarely does a single method suffice to evaluate a special function over its whole domain. For 
example, the exponential integral 

El ( X ) = r exp( ~ u) du (12) 

J X u 

is defined for all x ^ 0. (The Cauchy principal value is taken in (12) if x < 0.) However, the 
power series 

E 1 (x) + 7 + Ins = Y 1 (13) 

7 7 

3=1 J J 

is unsatisfactory as a means of evaluating E\{x) for large positive x, for the reasons discussed in 
Section 3 in connection with the power series for exp(x). For sufficiently large x it is preferable 
to use the asymptotic expansion [12] 

E^x) = exp(-x) £ {J ~ 1)l ^ iy 1 + Mx) , (14) 

where 

R k (x) = k\{-l) k r e -^fidu. (15) 

J X u 

For large positive x, the relative error attainable by using (14) with k ~ x is 0{x x l 2 exp(-x)), 
because 

\R k {k)\ < klk~( k+ V exp(-fe) = 0(^ 1/2 exp(-2fc)) . (16) 

Thus, the asymptotic series may be used to evaluate E\{x) to precision n when 

x > nln2 + O(lnn). (Similarly if (— x) > nln2 + O(lnn), although the estimation of \Rk(—k)\ 

is more difficult than that of \Rk(k)\.) 

There are many other examples where asymptotic expansions are useful, e.g. for erfc(x) (men- 
tioned in Section3), for Bessel functions [11, 35], etc. Asymptotic expansions aften arise when 
the convergence of series is accelerated by the Euler-Maclaurin sum formula [1]. For example, 
the Riemann zeta function ((s) is defined for R(s) > 1 by 

oo 

CM = E j~ s . ( 17 ) 

3=1 



5 



and by analytic continuation for other s / 1. (Here we allow complex s.) £(s) may be evaluated 
to any desired precision if m and p are chosen large enough in the Euler-Maclaurin formula [8] 

p— 1 i_ s m 

coo = E + l 2P' s + fn + E + > ( 18 ) 
j=i fc=i 

where 

2fc— 2 

nAs) = ^P 1 ' s - 2k Il(s+j), (19) 

|£ m , p (s)| < |T m+1)P (s) (s + 2m + l)/(a + 2m + 1)| , (20) 
m > 0, p > 1, o" = -R(s) > —(2m + 1), and the i?2fc are Bernoulli numbers. 

In arbitrary-precision computations we must be able to compute as many terms of an asymptotic 
expansion as are required to give the desired accuracy. It is easy to see that m in (18) can not 
be bounded as the precision n — > oo, else p would have to increase as an exponential function 
of n. To evaluate Q{s) from (18) to precision n in time polynomial in n, both m and p must 
tend to infinity with n. Thus, the Bernoulli numbers £?2, • • • , -62m can not be stored in a table 
of fixed size, but must be computed when needed (see Sections 7 and 9). For this reason we can 
not use asymptotic expansions when the general form of the coefficients is unknown (such as 
Stirling's formula for T(x)) in arbitrary-precision calculations. Often there is a related expansion 
with known coefficients, e.g. the asymptotic expansion for lnT(x) has coefficients related to the 
Bernoulli numbers, like (19). 



6 Continued fractions 

Sometimes continued fractions are preferable to power series or asymptotic expansions. For 
example, Euler's continued fraction [34] 

exp(x) E x {x) = 1/x + 1/1 + l/x + 2/1 + 2/x + 3/1 + ■ ■ ■ (21) 

converges for all real x > 0, and is better for computation of E\(x) than the power series 
(13) in the region where the power series suffers from catastrophic cancellation but the asymp- 
totic expansion (14) is not sufficiently accurate. Convergence of (21) is slow if x is small, so 
(21) is preferred for precision n evaluation of E\(x) only when x £ {c\n, C2n), c\ ~ 0.1, C2 ~ In 2. 



It is well known that continued fractions may be evaluated by either forward or backward 
recurrence relations. Consider the finite continued fraction 

y = 01/61 + a 2 /b 2 + ••• +a k /b k . (22) 

The backward recurrence is R k = 1, Rk-i = 

Rj = b j+1 R j+1 + a j+2 R j+2 (j = k - 2, . . . , 0) , (23) 

and y = ol\R\/Rq. The forward recurrence is Pq = 0, P± = a±, Qq = 1, Q± = b±, 



P 3 ~ h 3 ! ' i + "1 ! ' i -' 
Qj = bj Qj-l + CLj Qj-2 



(j = 2,...,k), (24) 



and y = P k /Qk- 
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The advantage of evaluating an infinite continued fraction such as (21) via the forward recurrence 
is that k need not be chosen in advance; we can stop when \Dk\ is sufficiently small, where 

Dk = ^~^- (25) 

The disadvantage of the forward recurrence is that twice as many arithmetic operations are 
required as for the backward recurrence with the same value of k. There is a simple solution to 
this dilemma if we are working with variable-precision floating-point arithmetic which is much 
more expensive than single-precision floating-point. We use the forward recurrence with single- 
precision arithmetic (scaled to avoid overflow/underflow) to estimate k, then use the backward 
recurrence with variable-precision arithmetic. One trick is needed: to evaluate using scaled 
single-precision we use the recurrence 



Dj = -ajQj^Dj-i/Qj (j = 2, 3, . . .) 



(26) 



which avoids the cancellation inherent in (25). 



In recent versions of the MP package [7] we have used the continued fraction (21) in the manner 
just described, and similar continued fractions could well be used for the computation of other 
special functions. Since power series and asymptotic series are generally easier to analyse and 
program than continued fractions, we have avoided continued fractions except where they are 
clearly superior to the other methods. 



7 Recurrence relations 

The evaluation of special functions by continued fractions is a special case of their evaluation 
by recurrence relations. For example, the Bessel functions J u (x) satisfy the recurrence relation 

J u -i{x) + J u+ i{x) = —Ju{x) (27) 
which may be evaluated backwards (compare (23)), using a normalisation condition such as 

oo 

J (x) + 2^4(x) = l. (28) 

This seems to be the most effective method in the region where Hankel's asymptotic expansion 
is insufficiently accurate but the power series 

suffers from catastrophic cancellation. For details see [17]. 

In Section 5 the constants Ck = i?2fc/(2A;)! were required, where the Bik are Bernoulli numbers. 
The Cfc are defined by the generating function 

oo 

£^ 2fc = -^ + f- (30) 

k=0 
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Multiplying both sides by e x — 1 and equating coefficients gives the recurrence relation 

Ck | Ck ~^ | | Cl = ( 3 i) 
1! 3! (2fe-l)! (2fc + l)! ' 1 ' 

which has often been used to evaluate Bernoulli numbers [21]. 

Unfortunately, forward evaluation of the recurrence (31) is numerically unstable: using precision 
n the relative error in the computed Ck is of order 4 fc 2 _ra . We shall not prove this, but shall 
indicate why such behaviour is to be expected. Consider the "homogeneous" recurrence 

l + ¥ + - + (2^=° < 32 > 

with Ci = 1, and let 

oo 

G(x) = Y, d k x 2k (33) 
fc=i 

be the generating function for the Ck- It is easy to show that 

e <*> " ss; ■ (34) 

Thus G{x) has poles at ±z7r , and 

|C fe | > ^Tr" 2 * (35) 

for some K > 0. This suggests that an error of order 2~ n in an early value of Cj propagates to 
give an (absolute) error of order 2~ n 7r~ 2k in Ck for large k. Since \Ck\ ~ (2vr) _2fc , this absolute 
error corresponds to a relative error of order 2 2k ~ n = A k 2~ n in Cfc. 

Despite its numerical instability, use of (31) may give the Ck to acceptable accuracy if they are 
only needed to generate coefficients in an Euler-Maclaurin expansion whose successive terms 
diminish by at least a factor of 4. If the Ck or B^k are required to precision n, either (31) must 
be used with sufficient guard digits, or a more stable recurrence must be used. If we multiply 
both sides of (30) by sinh(x/2)/x and equate coefficients, we get the recurrence 

Cfc-i Ci 2k 

k ^ 3! 4 (2/t-l)!4 fc - 1 (2fc + l)!4 fc 1 ' 

If (36) is used to evaluate Ck, using precision n arithmetic, the error is only 0(k 2 2~ n ). Thus, 
this method is currently used in the MP package instead of a method based on (31). 



8 Newton's method 

Newton's method and related zero-finding methods may be used to evaluate a function if we have 
an algorithm for evaluation of the inverse function. For example, applying Newton's method to 
f(x) = y — x~ m (where y is regarded as constant) gives the iteration 

Xj+i = Xj + Xj(l - x™ y)/m , (37) 

which converges (from a sufficiently good initial approximation) to y" 1 /" 1 . Note that (37) does 
not involve divisions except by the small integer m. 
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Similarly, applying Newton's method to f(x) = exp(x) — y gives the iteration 

Xj + i = Xj + y exp(-Xj) — 1 . (38) 
which converges to In y if xq is a sufficiently good initial approximation. 

Newton's method generally has second order convergence, so we may start with low precision 
and approximately double it at each iteration. Thus, the work required is of the same order 
as the work for the final iteration. Applied to (37) with m = 1 and 2, this argument shows 
that reciprocals and square roots can be found to precision n in 0(M(n)) operations. For futher 
details, and a comparison of the efficiencies of various root-finding methods for variable-precision 
computations, see [4, 5]. 



-rf dz , (40) 



9 Contour integration 

In this section we assume that facilities for variable-precision complex arithmetic are available. 
Let f(z) be holomorphic in the disc \z\ < R, R > 1, and let the power series for / be 

oo 

f(z) = £ a, z> (39) 

3=0 

From Cauchy's theorem [18] we have 

1 r f( z 

'c 

where C is the unit circle. The contour integral in (40) may be approximated numerically by 
sums 

fe-i 

S jk = - f(e 27Tim/k )e- 27Tijm/k . (41) 

From Cauchy's theorem, provided j < k and the contour C is enlarged slightly to enclose the 
k-th. roots of unity, we have 

_ 1 f f(z) 
hk Qj -™} c ^ k - (42) 

so \Sj,k — a,j\ = 0((R - 5)~( j+k ">) as k -)• oo, for any 5 > 0. 
For example, let 

as in Section 7, so a2j = B2j/(2j)\ and R = 2ir. Then 

q _ B2 i B 2 3 +k B 2j+2 k 

(2j)\ {2j + k)\ + {2j + 2k)\ + "' ' l44J 

so we can evaluate B 2 j with relative error 0((27r) _fc ) by evaluating f(z) at k points on the unit 
circle. (By symmetry and conjugacy only k/A + 1 evaluations are required if A; is a multiple of 
four.) If exp(—2irijm/k) is computed efficiently from exp(— 2iri/k) in the obvious way, the time 
required to evaluate B 2 , . . . , B 2 j to precision n is 0(jnM(n)), and the space required is 0(n). 
The recurrence relation method of Section 7 requires time only 0(j 2 n), but space 0{jn). Thus, 
the method of contour integration is recommended if space is more important than time. 



For further discussion of the contour integration method, see [26]. 
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10 Special methods 



In this section we mention two of a large number of "special" methods which are useful but less 
generally applicable than the methods of Sections 3 to 9. The first such method is the conversion 
of a power series which suffers from catastrophic cancellation to one which is better behaved 
numerically. One example, (9), has already been given. Another example occurs with 

E(x)= p-o^f^-iy- (45) 

Jo « fri Jh 

(a series encountered in Section 5). Multiplying by exp(x) and using some well known identities, 
we find 

oo 

esxp(x)E(x) = Hj xi/j\ , (46) 
i=i 

where 

B > < 47 > 

m=l 

If x is large and positive, the series in (46) is much better behaved numerically than the series in 
(45). For an application where E(x) was required to high precision with x a positive integer, see 
[11]. At first sight it appears that, in this application, the summation to precision n of k terms 
in the series (45) requires 0(kn) operations, while (46) requires Q(kM(n)) operations. How- 
ever, by a "summation by parts" trick described in [11], this can be reduced to 0{kn) operations. 

Our second "special" method is the evaluation of it and elementary functions by the arithmetic- 
geometric mean (AGM) iteration. It is well known that the AGM can be used to compute 
elliptic integrals, but perhaps less well known that it can also be used to compute it and ele- 
mentary functions, and gives the fastest known methods when the precision n is very large [4, 6]. 

The AGM of two positive numbers ao and bo is a = lim aj = lim bj, where 

0-i +bj . , . 

aj+i = " J -y^ (48) 

and 

b j + 1 = y^~b~ . (49) 

There is no essential loss of generality in assuming that ao = 1 and bo = cos (p. Gauss [16] 
showed that 2a = it /K ((f)), where 

rn/2 

K(<t>) = (1 - sin 2 </> sin 2 ^)- 1 / 2 d9 (50) 

J o 

is the complete elliptic integral of the first kind. A simple proof is given in [27]. 
The AGM iteration converges quadratically: if Ej = 1 — bj/aj then 

e j+1 = 1 - 2(1 - e,) 1/2 /(2 - ej) = e 2 /8 + 0{e)) . (51) 

Using the AGM and an identity of Lagrange, we get a family of quadratically convergent algo- 
rithms for the computation of it. The simplest of these is: 
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a := 1 ; 
b := 1/V2; 
t : = 1/4 ; 

3 ■■= i ; 

repeat 

y := a; 

a := ( a + b) /2 ; 

6 := Vbxy ; 

i := t — j x (a — y) 2 ; 

j := 2 x j 
until (a — b) < tolerance ; 
return a 2 /t. 

After k iterations the error \a 2 /t — vr| is about 87rexp(— 2 fe 7r), e.g. k = 5 gives error less than 
10" 42 . For further details see [6, 31]. 

In [3, 4, 6] it is shown how the AGM may be used to compute the elementary functions exp(x), 
ln(x), atan(x), sin(x) etc. to precision n in 0(M(n) log n) operations. The factor "logn" arises 
because O(logn) iterations of the AGM are required. It is important to note that the AGM 
iteration is not self-correcting, so the trick of starting with low precision and doubling it on each 
iteration (as used in Section 8) is not applicable. 

1 1 Summary 

Many "classical" methods may be adapted for use in variable-precision computations; others are 
not readily adaptable. Since the performance criteria are different in variable-precision appli- 
cations, the best method may be one which is not well-suited to fixed-precision computations. 
For example, it might be numerically unstable, and thus require the working precision to be 
increased. The examples given in Sections 3 to 10 above are intended to illustrate the main 
ideas of variable-precision algorithms. 
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